
    Improving Table Compression with Combinatorial Optimization

    We study the problem of compressing massive tables within the partition-training paradigm introduced by Buchsbaum et al. [SODA'00], in which a table is partitioned by an off-line training procedure into disjoint intervals of columns, each of which is compressed separately by a standard, on-line compressor like gzip. We provide a new theory that unifies previous experimental observations on partitioning and heuristic observations on column permutation, all of which are used to improve compression rates. Based on the theory, we devise the first on-line training algorithms for table compression, which can be applied to individual files, not just continuously operating sources; and also a new, off-line training algorithm, based on a link to the asymmetric traveling salesman problem, which improves on prior work by rearranging columns prior to partitioning. We demonstrate these results experimentally. On various test files, the on-line algorithms provide 35-55% improvement over gzip with negligible slowdown; the off-line reordering provides up to 20% further improvement over partitioning alone. We also show that a variation of the table compression problem is MAX-SNP hard. Comment: 22 pages, 2 figures, 5 tables, 23 references. Extended abstract appears in Proc. 13th ACM-SIAM SODA, pp. 213-222, 200
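
The partitioning idea in the abstract above can be illustrated with a small sketch. It is not the authors' training algorithm: it assumes a table given as a list of equal-length byte rows, uses `zlib` as a stand-in for gzip, and picks contiguous column intervals with a plain dynamic program. The names `interval_cost` and `best_partition` are hypothetical.

```python
import zlib


def interval_cost(rows, start, end):
    """Compressed size of columns [start, end), stored row by row."""
    block = b"".join(row[start:end] for row in rows)
    return len(zlib.compress(block, 9))


def best_partition(rows):
    """Choose contiguous column intervals minimizing total compressed size.

    Illustrative dynamic program only: best[i] is the smallest total size
    for columns [0, i), and cut[i] is where the last interval starts.
    """
    width = len(rows[0])
    best = [0] + [None] * width
    cut = [0] * (width + 1)
    for i in range(1, width + 1):
        for j in range(i):
            cost = best[j] + interval_cost(rows, j, i)
            if best[i] is None or cost < best[i]:
                best[i], cut[i] = cost, j
    # Walk the cut points backwards to recover the intervals.
    intervals = []
    i = width
    while i > 0:
        intervals.append((cut[i], i))
        i = cut[i]
    intervals.reverse()
    return best[width], intervals
```

Because the single interval covering all columns is one of the candidates, the returned total is never larger than compressing the whole table as one block. The sketch runs the compressor on every candidate interval, which is only practical for narrow tables.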

    Rectangular Layouts and Contact Graphs

    Contact graphs of isothetic rectangles unify many concepts from applications including VLSI and architectural design, computational geometry, and GIS. Minimizing the area of their corresponding rectangular layouts is a key problem. We study the area-optimization problem and show that it is NP-hard to find a minimum-area rectangular layout of a given contact graph. We present O(n)-time algorithms that construct O(n^2)-area rectangular layouts for general contact graphs and O(n log n)-area rectangular layouts for trees. (For trees, this is an O(log n)-approximation algorithm.) We also present an infinite family of graphs (resp., trees) that require Ω(n^2) (resp., Ω(n log n)) area. We derive these results by presenting a new characterization of graphs that admit rectangular layouts using the related concept of rectangular duals. A corollary to our results relates the class of graphs that admit rectangular layouts to rectangle of influence drawings. Comment: 28 pages, 13 figures, 55 references, 1 appendi

    Restricted Strip Covering and the Sensor Cover Problem

    Given a set of objects with durations (jobs) that cover a base region, can we schedule the jobs to maximize the duration the original region remains covered? We call this problem the sensor cover problem. This problem arises in the context of covering a region with sensors. For example, suppose you wish to monitor activity along a fence by sensors placed at various fixed locations. Each sensor has a range and limited battery life. The problem is to schedule when to turn on the sensors so that the fence is fully monitored for as long as possible. This one-dimensional problem involves intervals on the real line. Associating a duration to each yields a set of rectangles in space and time, each specified by a pair of fixed horizontal endpoints and a height. The objective is to assign a position to each rectangle to maximize the height at which the spanning interval is fully covered. We call this one-dimensional problem restricted strip covering. If we replace the covering constraint by a packing constraint, the problem is identical to dynamic storage allocation, a scheduling problem that is a restricted case of the strip packing problem. We show that the restricted strip covering problem is NP-hard and present an O(log log n)-approximation algorithm. We present better approximations or exact algorithms for some special cases. For the uniform-duration case of restricted strip covering we give a polynomial-time, exact algorithm but prove that the uniform-duration case for higher-dimensional regions is NP-hard. Finally, we consider regions that are arbitrary sets, and we present an O(log n)-approximation algorithm. Comment: 14 pages, 6 figure
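
To make the objective in the abstract above concrete, the following sketch evaluates a given schedule; it does not implement the approximation algorithm. It assumes each sensor is a `(left, right, duration)` tuple with a closed range, the fence is an interval `(lo, hi)`, and the schedule is a list of start times. The function name `covered_duration` is hypothetical.

```python
def covered_duration(fence, sensors, starts):
    """Length of time, counted from 0, for which the whole fence is covered.

    fence:   (lo, hi) interval on the real line
    sensors: list of (left, right, duration) tuples
    starts:  start time assigned to each sensor, in the same order
    """
    lo, hi = fence
    # Sensor endpoints strictly inside the fence split it into elementary
    # segments; every sensor covers each segment fully or not at all.
    inner = {x for l, r, _ in sensors for x in (l, r) if lo < x < hi}
    xs = sorted({lo, hi} | inner)
    best = float("inf")
    for a, b in zip(xs, xs[1:]):
        # Active time windows of the sensors whose range contains [a, b].
        windows = sorted(
            (s, s + d)
            for (l, r, d), s in zip(sensors, starts)
            if l <= a and b <= r
        )
        t = 0
        for begin, end in windows:
            if begin > t:  # gap in time: this segment goes dark at t
                break
            t = max(t, end)
        best = min(best, t)
    return best
```

For a fence `(0, 10)` and sensors `[(0, 6, 3), (4, 10, 3), (0, 10, 2)]`, starting all three at time 0 keeps the fence covered for 3 time units, while delaying the third sensor to time 3 extends this to 5. Choosing start times that maximize this value is the restricted strip covering problem.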

    Longitudinal Assessment of Gray and White Matter in Chronic Schizophrenia: A Combined Diffusion-Tensor and Structural Magnetic Resonance Imaging Study

    Previous studies have reported continued focal gray matter loss after the clinical onset of schizophrenia. Longitudinal assessments in chronic illness, of white matter in particular, have been less conclusive.

    ReCoil - an algorithm for compression of extremely large datasets of DNA data

    The growing volume of generated DNA sequencing data makes the problem of its long-term storage increasingly important. In this work we present ReCoil, an I/O-efficient external-memory algorithm designed for compression of very large collections of short-read DNA data. Typically each position of a DNA sequence is covered by multiple reads of a short-read dataset, and our algorithm makes use of the resulting redundancy to achieve a high compression rate.

    Diffusion tensor imaging of frontal lobe white matter tracts in schizophrenia

    We acquired diffusion tensor and structural MRI images on 103 patients with schizophrenia and 41 age-matched normal controls. The vector data was used to trace tracts from a region of interest in the anterior limb of the internal capsule to the prefrontal cortex. Patients with schizophrenia had tract paths that were significantly shorter in length from the center of the internal capsule to prefrontal white matter. These tracts, the anterior thalamic radiations, are important in frontal-striatal-thalamic pathways. These results are consistent with findings of smaller size of the anterior limb of the internal capsule in patients with schizophrenia, diffusion tensor anisotropy decreases in frontal white matter in schizophrenia, and hypothesized disruption of the frontal-striatal-thalamic pathway system.